Automated mapping of large-scale chromatin structure in ENCODE
Abstract
MOTIVATION: A recently developed DNaseI assay has given us our first genome-wide view of chromatin structure. In addition to cataloging DNaseI hypersensitive sites, these data allow us to characterize the overall features of chromatin accessibility more completely. We employed a Bayesian hierarchical change-point model (CPM), a generalization of a hidden Markov model (HMM), to characterize tiled microarray DNaseI sensitivity data available from the ENCODE project.

RESULTS: Our analysis shows that the accessibility of chromatin to cleavage by DNaseI is well described by a four-state model of local segments, with each state described by a continuous mixture of Gaussian variables. The CPM produces a better fit to the observed data than the HMM. The large posterior probability for the four-state CPM suggests that the data fall naturally into four classes of regions, which we call major and minor DNaseI hypersensitive sites (DHSs), regions of intermediate sensitivity, and insensitive regions. These classes agree well with a model of chromatin in which local disruptions (DHSs) are concentrated within larger domains of intermediate sensitivity, the accessibility islands. The CPM assigns 92% of the bases within the ENCODE regions to the insensitive regions. The 5.8% of the bases that lie in regions of intermediate sensitivity are clearly enriched in functional elements, including genes and activating histone modifications, while the remaining 2.2% of the bases, which lie in hypersensitive regions, are very strongly enriched in these elements.

AVAILABILITY: The CPM software is available upon request from the authors.
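To make the idea of a four-state segmentation concrete, the following is a minimal sketch of the simpler baseline named in the abstract: a plain four-state HMM with Gaussian emissions, decoded with the Viterbi algorithm. It is not the authors' Bayesian hierarchical CPM, which is available only on request. The state means, standard deviations, the `stay_prob` transition parameter, and all function names are hypothetical values chosen for illustration.

```python
import math

# State labels follow the four classes named in the abstract.
STATES = ["insensitive", "intermediate", "minor_DHS", "major_DHS"]


def gaussian_logpdf(x, mean, sd):
    """Log density of a normal distribution N(mean, sd^2) at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_segment(signal, means, sds, stay_prob=0.9):
    """Return the most likely state index for each probe in `signal`.

    Assumes (for illustration only) a uniform initial distribution and a
    transition matrix with `stay_prob` on the diagonal and the remaining
    probability shared equally among the other states.
    """
    if not signal:
        return []
    n = len(means)
    log_stay = math.log(stay_prob)
    log_switch = math.log((1 - stay_prob) / (n - 1))

    # Best log score of any path ending in each state at the first probe.
    score = [math.log(1 / n) + gaussian_logpdf(signal[0], means[k], sds[k])
             for k in range(n)]
    backpointers = []

    for x in signal[1:]:
        new_score, pointers = [], []
        for k in range(n):
            # Pick the best predecessor state j for current state k.
            candidates = [score[j] + (log_stay if j == k else log_switch)
                          for j in range(n)]
            best_j = max(range(n), key=lambda j: candidates[j])
            new_score.append(candidates[best_j] + gaussian_logpdf(x, means[k], sds[k]))
            pointers.append(best_j)
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(range(n), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def to_segments(path):
    """Collapse a per-probe state path into (start, end, label) runs, end exclusive."""
    segments = []
    start = 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            segments.append((start, i, STATES[path[start]]))
            start = i
    return segments
```

Calling `to_segments(viterbi_segment(signal, means, sds))` on a list of per-probe sensitivity values yields contiguous labeled regions. A change-point model generalizes this by modeling segment boundaries and segment-level parameters directly, rather than assuming one fixed emission distribution per state.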
Similar resources
Independent and complementary methods for large-scale structural analysis of mammalian chromatin.
The fundamental building block of chromatin, the nucleosome, occupies 150 bp of DNA in a spaced arrangement that is a primary determinant in regulation of the genome. The nucleosomal organization of some regions of the human genome has been described, but mapping of these regions has been limited to a few kilobases. We have explored two independent and complementary methods for the high-through...
Fully automated high-throughput chromatin immunoprecipitation for ChIP-seq: Identifying ChIP-quality p300 monoclonal antibodies
Chromatin immunoprecipitation coupled with DNA sequencing (ChIP-seq) is the major contemporary method for mapping in vivo protein-DNA interactions in the genome. It identifies sites of transcription factor, cofactor and RNA polymerase occupancy, as well as the distribution of histone marks. Consortia such as the ENCyclopedia Of DNA Elements (ENCODE) have produced large datasets using manual pro...
Identification of Gene Positioning Factors Using High-Throughput Imaging Mapping
Genomes are arranged non-randomly in the 3D space of the cell nucleus. Here, we have developed HIPMap, a high-precision, high-throughput, automated fluorescent in situ hybridization imaging pipeline, for mapping of the spatial location of genome regions at large scale. High-throughput imaging position mapping (HIPMap) enabled an unbiased siRNA screen for factors involved in genome organization ...
Modification of enhancer chromatin: what, how, and why?
Emergence of form and function during embryogenesis arises in large part through cell-type- and cell-state-specific variation in gene expression patterns, mediated by specialized cis-regulatory elements called enhancers. Recent large-scale epigenomic mapping revealed unexpected complexity and dynamics of enhancer utilization patterns, with 400,000 putative human enhancers annotated by the ENCOD...
Automated microscopy identifies estrogen receptor subdomains with large-scale chromatin structure unfolding activity.
BACKGROUND Recently, several transcription factors were found to possess large-scale chromatin unfolding activity; these include the VP16 acidic activation domain, BRCA1, E2F1, p53, and the glucocorticoid and estrogen steroid receptors. In these studies, proteins were fluorescently labeled and targeted to a multimerized array of DNA sequences in mammalian cultured cells, and changes in the appe...
Journal:
- Bioinformatics
Volume 24
Publication date: 2008